Combining Outputs of Multiple Japanese Named Entity Chunkers by Stacking

نویسندگان

  • Takehito Utsuro
  • Manabu Sassano
  • Kiyotaka Uchimoto
چکیده

In this paper, we propose a method for learning a classifier which combines outputs of more than one Japanese named entity extractors. The proposed combination method belongs to the family of stacked generalizers, which is in principle a technique of combining outputs of several classifiers at the first stage by learning a second stage classifier to combine those outputs at the first stage. Individual models to be combined are based on maximum entropy models, one of which always considers surrounding contexts of a fixed length, while the other considers those of variable lengths according to the number of constituent morphemes of named entities. As an algorithm for learning the second stage classifier, we employ a decision list learning method. Experimental evaluation shows that the proposed method achieves improvement over the best known results with Japanese named entity extractors based on maximum entropy models.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving the Performance of a Named Entity Extractor by Applying a Stacking Scheme

In this paper we investigate the way of improving the performance of a Named Entity Extraction (NEE) system by applying machine learning techniques and corpus transformation. The main resources used in our experiments are the publicly available tagger TnT and a corpus of Spanish texts in which named entities occurrences are tagged with BIO tags. We split the NEE task into two subtasks 1) Named ...

متن کامل

A Stacked, Voted, Stacked Model for Named Entity Recognition

This paper investigates stacking and voting methods for combining strong classifiers like boosting, SVM, and TBL, on the named-entity recognition task. We demonstrate several effective approaches, culminating in a model that achieves error rate reductions on the development and test sets of 63.6% and 55.0% (English) and 47.0% and 51.7% (German) over the CoNLL-2003 standard baseline respectively...

متن کامل

PAYMA: A Tagged Corpus of Persian Named Entities

The goal in the named entity recognition task is to classify proper nouns of a piece of text into classes such as person, location, and organization. Named entity recognition is an important preprocessing step in many natural language processing tasks such as question-answering and summarization. Although many research studies have been conducted in this area in English and the state-of-the-art...

متن کامل

Learning with Multiple Stacking for Named Entity Recognition

Then, given that the current position is at word W , the task is to assign tag T to W . In the named entity recognition task, an entity is often made up of a sequence of words, rather than a single word. For example, an entity “the United States of America” consists of five words. In order to allocate a tag to each word, the tags of the surrounding words (we call these tags the surrounding tags...

متن کامل

A Language Independent Method for Question Classification

Previous works on question classification are based on complex natural language processing techniques: named entity extractors, parsers, chunkers, etc. While these approaches have proven to be effective they have the disadvantage of being targeted to a particular language. We present here a simple approach that exploits lexical features and the Internet to train a classifier, namely a Support V...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002